[amd_dev] branch rebase #25753

HAIAI · 2025-09-26T08:22:02Z

Purpose

Rebase and sync the amd_dev branch onto the latest main.

Test Plan

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results.
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

gemini-code-assist

Code Review

This pull request appears to be a large rebase of a development branch, introducing a wide array of changes across the codebase. Key updates include a major refactoring of the CI/CD pipelines, deprecation of old benchmark scripts in favor of a new CLI, and the addition of numerous new benchmarks. On the feature side, there's new support for FP8 on ROCm, various new fused kernels for performance, and significant improvements to the CPU backend with oneDNN and scalar fallbacks. I've identified two critical correctness issues related to potential out-of-bounds memory access in MoE kernels, which I've detailed in the comments below. The rest of the changes, including new features and extensive refactoring, appear solid.

gemini-code-assist · 2025-09-26T08:24:53Z

csrc/moe/moe_align_sum_kernels.cu

    int expert_id = topk_ids[i];
+    if (expert_id >= num_experts) {
+      continue;
+    }


This bounds check on expert_id is a critical fix. Without it, an invalid ID from topk_ids could lead to an out-of-bounds atomic add on shared_counts, potentially causing memory corruption or a crash. This is an important safeguard for correctness and stability.
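
The effect of the guard can be illustrated outside the kernel. The following is a minimal CPU-side Python sketch, not the CUDA code from this PR: `count_tokens_per_expert` is a hypothetical helper, a plain list stands in for `shared_counts`, and the extra lower-bound test is an assumption added only because Python would silently wrap negative indices.

```python
def count_tokens_per_expert(topk_ids, num_experts):
    """Count how many routed tokens each expert receives, skipping invalid IDs."""
    counts = [0] * num_experts  # stand-in for the kernel's shared_counts
    for expert_id in topk_ids:
        # Mirrors the guard in the diff: IDs at or above num_experts are skipped.
        # The `expert_id < 0` test is an assumption of this sketch, not part of the diff.
        if expert_id < 0 or expert_id >= num_experts:
            continue
        counts[expert_id] += 1
    return counts


if __name__ == "__main__":
    # ID 5 is out of range for 3 experts and is ignored instead of indexing past the end.
    print(count_tokens_per_expert([0, 2, 5, 2, 1], num_experts=3))
```

Without the guard, the increment for ID 5 would target a slot that does not exist, which is the out-of-bounds access the comment describes.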

gemini-code-assist · 2025-09-26T08:24:53Z

csrc/moe/moe_align_sum_kernels.cu

    int32_t expert_id = topk_ids[i];
+    if (expert_id >= num_experts) {
+      continue;
+    }


Adding this bounds check for expert_id is crucial for correctness and security. It prevents potential out-of-bounds writes to cumsum_buffer and sorted_token_ids if topk_ids contains an invalid expert ID, which could otherwise lead to memory corruption or crashes.
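
To see why the same guard matters for the placement step, here is a simplified Python sketch of grouping token indices by expert with a prefix sum. It is an illustration under stated assumptions, not the kernel's logic: `sort_tokens_by_expert` is a hypothetical name, block-size padding is omitted, and the lower-bound check is again an addition of this sketch.

```python
from itertools import accumulate


def sort_tokens_by_expert(topk_ids, num_experts):
    """Group token indices by expert using an exclusive prefix sum of counts."""

    def is_valid(expert_id):
        # Upper bound mirrors the diff; the lower bound is an assumption of this sketch.
        return 0 <= expert_id < num_experts

    counts = [0] * num_experts
    for expert_id in topk_ids:
        if is_valid(expert_id):
            counts[expert_id] += 1

    # offsets[e] is the first output slot for expert e (stand-in for cumsum_buffer).
    offsets = [0] + list(accumulate(counts))
    cursor = offsets[:-1].copy()

    sorted_token_ids = [-1] * offsets[-1]
    for token_idx, expert_id in enumerate(topk_ids):
        if not is_valid(expert_id):
            continue  # an invalid ID would otherwise index past cursor/offsets
        sorted_token_ids[cursor[expert_id]] = token_idx
        cursor[expert_id] += 1
    return sorted_token_ids, offsets


if __name__ == "__main__":
    print(sort_tokens_by_expert([1, 0, 7, 1, 2], num_experts=3))
```

Skipping the invalid ID in both passes keeps the counts and the write positions consistent, so every write lands inside the output buffer.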

tests/conftest.py

+        if ctype is None:
+            ctype = {"jpg": "image/jpg", "png": "image/png"}[ext]
+        self.send_response(200)
+        self.send_header("Content-Type", ctype)


To fix the potential HTTP response splitting vulnerability, sanitize or strictly validate any user-derived input used to construct HTTP header values. Before ctype is used as a header value, ensure it contains no CR, LF, or colon characters that could allow header injection, by stripping or replacing them if present. Although this code accepts only jpg/png extensions, sanitizing anyway is a prudent belt-and-braces defense. The fix will:

Strip/disallow \r, \n, and : characters from ctype before it is used in send_header.

Do this immediately before using ctype on line 1173.

No new methods or imports are necessary, as simple string methods suffice for this sanitization.
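
The two options named in the suggestion, stripping versus strict validation, can be sketched as standalone helpers. This is a hedged illustration only: `strip_header_chars` and `is_safe_header_value` are hypothetical names that do not appear in tests/conftest.py, and the stripped character set simply follows the description above.

```python
# Characters the suggestion asks to remove from the header value.
FORBIDDEN_HEADER_CHARS = ("\r", "\n", ":")


def strip_header_chars(value: str) -> str:
    """Remove CR, LF, and colon so the value cannot start a new header line."""
    for ch in FORBIDDEN_HEADER_CHARS:
        value = value.replace(ch, "")
    return value


def is_safe_header_value(value: str) -> bool:
    """Strict alternative: accept the value only if no forbidden character is present."""
    return not any(ch in value for ch in FORBIDDEN_HEADER_CHARS)


if __name__ == "__main__":
    injected = "image/png\r\nSet-Cookie: x=1"
    print(repr(strip_header_chars(injected)))
    print(is_safe_header_value(injected), is_safe_header_value("image/png"))
```

Stripping keeps the request working with a degraded value, while validation lets the caller reject the request outright; which is preferable depends on how the handler should fail.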

yewentao256

Sorry, what is this PR for? Do we need to land it? Please give more context in the description.

HAIAI · 2025-09-26T15:43:36Z

@yewentao256 We are trying to land it.

yewentao256

Sorry, I didn't realize that this is from main to dev, not dev to main.

njhill and others added 30 commits September 19, 2025 16:34

[BugFix] Fix async scheduling CPU tensor race take 2 (#25279)

14c1432

Signed-off-by: Nick Hill <[email protected]>

[Bugfix] Remove VLLM_TEST_DYNAMO_FULLGRAPH_CAPTURE #2969 (#25090)

3da17c2

Signed-off-by: Lucas Kabela <[email protected]>

Don't skip special tokens with hermes-style tool calling (#25281)

a36c675

Signed-off-by: Max de Bayser <[email protected]>

test: Remove vestigial skip for prompt embeds tests after landing v1 …

c7e7136

…Prompt Embeds support (#25291) Signed-off-by: Andrew Sansom <[email protected]>

[docs] Prompt Embedding feature support (#25288)

b8a287a

Signed-off-by: Andrew Sansom <[email protected]>

[torch.compile] CUDAGraph Inductor partition integration (#24281)

8945b00

Signed-off-by: Boyuan Feng <[email protected]> Signed-off-by: Boyuan Feng <[email protected]> Signed-off-by: boyuanfeng <[email protected]> Co-authored-by: Luka Govedič <[email protected]>

[BugFix] Ensure appropriate guards in destructors (#25284)

a25ade5

Signed-off-by: Nick Hill <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

[Misc] Support more collective_rpc return types (#25294)

535d800

Signed-off-by: Nick Hill <[email protected]>

Improve weight loading for encoder models in Transformers backend (#2…

c308501

…5289) Signed-off-by: Harry Mellor <[email protected]>

[BUGFIX] GPTQ quantization compatibility for Qwen3 Next MOE models (A…

3642909

…utoGPTQ and AutoRound-GPTQ) (#25268) Signed-off-by: JartX <[email protected]>

[BugFix] Exclude self when checking for port collision (#25286)

b7f186b

Signed-off-by: Nick Hill <[email protected]>

[BUG FIX][NON-CUDA]quick fix to avoid call cudagraph_unsafe in attent…

6c5f82e

…ion (#25298) Signed-off-by: Chendi Xue <[email protected]>

[Bugfix] fix tool call arguments is empty (#25223)

f91480b

Signed-off-by: chaunceyjiang <[email protected]> Co-authored-by: xin.li <[email protected]>

[Optimization] Avoid repeated model architecture conversion for pooli…

c60e613

…ng models (#25261) Signed-off-by: DarkLight1337 <[email protected]>

[Hybrid Allocator] Support full attention with different hidden size (#…

9607d5e

…25101) Signed-off-by: Chen Zhang <[email protected]>

[Bugfix] Fix Qwen3-VL-MoE weight loading for EP (#25300)

be874c0

Signed-off-by: Roger Wang <[email protected]>

[V1] Support LLM.apply_model (#18465)

3d9a1d2

Signed-off-by: DarkLight1337 <[email protected]>

[CI Failure] Disable FlashInfer RoPE to unblock CI (#25299)

e08a3a3

Signed-off-by: mgoin <[email protected]>

[Docs] Fix warnings in mkdocs build (continued) (#25042)

032d661

Signed-off-by: wwl2755 <[email protected]>

[Model] Cleanup InternViT's data parallel implementation (#25306)

3c713a9

Signed-off-by: Isotr0py <[email protected]>

[Core] Enable sharded state loader for V1 engine and enhance test cov…

d88918e

…erage (#25308) Signed-off-by: pengdrumli <[email protected]>

[V0 Deprecation] Enable the remaining multimodal tests in V1 (#25307)

bef180f

Signed-off-by: DarkLight1337 <[email protected]>

[Docs] Fix warnings in vllm/profiler and vllm/transformers_utils (#25220

367a480

) Signed-off-by: windsonsea <[email protected]>

[V0 Deprecation] Remove LLMEngine (#25033)

52c2a8d

Signed-off-by: Woosuk Kwon <[email protected]> Signed-off-by: Woosuk Kwon <[email protected]>

[V0 Deprecation] Remove V0 Output Processor (#25320)

86647d1

Signed-off-by: Woosuk Kwon <[email protected]> Signed-off-by: Woosuk Kwon <[email protected]>

[Chore] Remove unused sampler in models (#25324)

572ddf8

Signed-off-by: Woosuk Kwon <[email protected]>

[CI] Skip tests failing on main (#25326)

72dd159

Signed-off-by: Woosuk Kwon <[email protected]>

[V0 Deprecation] Remove V0 core (#25321)

c99db8c

Signed-off-by: Woosuk Kwon <[email protected]> Signed-off-by: Woosuk Kwon <[email protected]>

[Doc] improve test-pipeline.yaml documentation (#25305)

62b38dc

Signed-off-by: Huamin Li <[email protected]> Co-authored-by: Lu Fang <[email protected]>

mergify bot added structured-output speculative-decoding labels Sep 26, 2025

github-project-automation bot added this to Structured Output Sep 26, 2025

github-project-automation bot moved this to To Triage in gpt-oss Issues & Enhancements Sep 26, 2025

mergify bot added v1 tpu Related to Google TPUs tool-calling labels Sep 26, 2025

github-project-automation bot added this to Tool Calling Sep 26, 2025

mergify bot added the kv-connector label Sep 26, 2025

gemini-code-assist bot reviewed Sep 26, 2025

github-advanced-security bot found potential problems Sep 26, 2025

SageMoore and others added 8 commits September 26, 2025 01:25

[Bugfix] Fix Shared Expert/Zero expert code in FusedMoE.process_chunk (…

dfb9af2

…#25698) Signed-off-by: Sage Moore <[email protected]> Co-authored-by: Robert Shaw <[email protected]>

Support LongCat-Flash-Chat tool call (#24083)

b03b1b9

Signed-off-by: 许文卿 <[email protected]>

[Doc] Update Batch-level DP docs (#25757)

633f943

Signed-off-by: DarkLight1337 <[email protected]>

[Model] Mamba2 varlen refactor (#21467)

2b6b1d7

Signed-off-by: Chih-Chieh-Yang <[email protected]> Co-authored-by: RishiAstra <[email protected]>

[CI] Fix test_shared_storage_connector_hashes (#25748)

2827b3f

Signed-off-by: chaunceyjiang <[email protected]>

[Bugfix] Properly abort pooling request. (#25734)

fe6b19c

Signed-off-by: wang.yuqi <[email protected]> Co-authored-by: Cyrus Leung <[email protected]>

[CI/Build] Split up Distributed Tests (#25572)

bc9d7b5

Signed-off-by: DarkLight1337 <[email protected]>

[CI/Build] Fix some V1 tests not being run (#25569)

db1e42f

Signed-off-by: DarkLight1337 <[email protected]>

yewentao256 requested changes Sep 26, 2025

github-project-automation bot moved this from To Triage to In progress in gpt-oss Issues & Enhancements Sep 26, 2025

yewentao256 approved these changes Sep 26, 2025

github-project-automation bot moved this from In progress to Ready in gpt-oss Issues & Enhancements Sep 26, 2025

[Quantization] Add field to skip unquantized modules for GPTQ config (#…

d4d9899

…25455) Signed-off-by: Isotr0py <[email protected]>

HAIAI added the ready-for-merge Indicate this PR is ready to be merged by the maintainers, used by reviewers without merge access. label Sep 26, 2025

hmellor merged commit aee7633 into amd_dev Sep 26, 2025
17 of 26 checks passed

github-project-automation bot moved this to Done in Structured Output Sep 26, 2025

github-project-automation bot moved this to Done in Tool Calling Sep 26, 2025

github-project-automation bot moved this from Ready to Done in gpt-oss Issues & Enhancements Sep 26, 2025

@@ -1170,7 +1170,8 @@
                     if ctype is None:
                         ctype = {"jpg": "image/jpg", "png": "image/png"}[ext]
                     self.send_response(200)
-                    self.send_header("Content-Type", ctype)
+                    safe_ctype = ctype.replace("\n", "").replace("\r", "").replace(":", "")
+                    self.send_header("Content-Type", safe_ctype)
                     self.send_header("Content-Length", str(len(data)))
                     self.end_headers()
                     self.wfile.write(data)
